Pruning Techniques for Mixed Ensembles of Genetic Programming Models
نویسندگان
چکیده
The objective of this paper is to define an effective strategy for building an ensemble of Genetic Programming (GP) models. Ensemble methods are widely used in machine learning due to their features: they average out biases, they reduce the variance and they usually generalize better than single models. Despite these advantages, building ensemble of GP models is not a well-developed topic in the evolutionary computation community. To fill this gap, we propose a strategy that blends individuals produced by standard syntax-based GP and individuals produced by geometric semantic genetic programming, one of the newest semantics-based method developed in GP. In fact, recent literature showed that combining syntax and semantics could improve the generalization ability of a GP model. Additionally, to improve the diversity of the GP models used to build up the ensemble, we propose different pruning criteria that are based on correlation and entropy, a commonly used measure in information theory. Experimental results, obtained over different complex problems, suggest that the pruning criteria based on correlation and entropy could be effective in improving the generalization ability of the ensemble model and in reducing the computational burden required to build it.
منابع مشابه
Ensemble Learning and Pruning in Multi-Objective Genetic Programming for Classification with Unbalanced Data
Machine learning algorithms can suffer a performance bias when data sets are unbalanced. This paper develops a multi-objective genetic programming approach to evolving accurate and diverse ensembles of non-dominated solutions where members vote on class membership. We explore why the ensembles can also be vulnerable to the learning bias using a range of unbalanced data sets. Based on the notion...
متن کاملGlobal Supply Chain Management under Carbon Emission Trading Program Using Mixed Integer Programming and Genetic Algorithm
In this paper, the transportation problem under the carbon emission trading program ismodelled by mathematical programming and genetic algorithm. Since green supply chain issuesbecome important and new legislations are taken into account, carbon emissions costs are included inthe total costs of the supply chain. The optimisation model has the ability to minimise the total costsand provides the ...
متن کاملPruning GP-Based Classifier Ensembles by Bayesian Networks
Classifier ensemble techniques are effectively used to combine the responses provided by a set of classifiers. Classifier ensembles improve the performance of single classifier systems, even if a large number of classifiers is often required. This implies large memory requirements and slow speeds of classification, making their use critical in some applications. This problem can be reduced by s...
متن کاملMathematical Programming Models for Solving Unequal-Sized Facilities Layout Problems - a Generic Search Method
This paper present unequal-sized facilities layout solutions generated by a genetic search program named LADEGA (Layout Design using a Genetic Algorithm). The generalized quadratic assignment problem requiring pre-determined distance and material flow matrices as the input data and the continuous plane model employing a dynamic distance measure and a material flow matrix are discussed. Computa...
متن کاملComparing Mixed-Integer and Constraint Programming for the No-Wait Flow Shop Problem with Due Date Constraints
The impetus for this research was examining a flow shop problem in which tasks were expected to be successively carried out with no time interval (i.e., no wait time) between them. For this reason, they should be completed by specific dates or deadlines. In this regard, the efficiency of the models was evaluated based on makespan. To solve the NP-Hard problem, we developed two mathematical mode...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2018